Compacting XML Documents
Abstract
One of the most common formats for storing information today is XML. XML documents can be rather large, and they may contain redundant attributes whose values can be calculated from other attributes. The main idea of our paper is based on a relationship between XML documents and attribute grammars. Using this relationship, semantic rules for XML attributes can be defined in a metalanguage called SRML. Building on this metalanguage, we developed a method for compacting XML documents. After compaction, XML compressors can be applied to make the compacted document even smaller, which increases the potential compression ratio of the compressors. The rules can be devised manually or by a machine learning approach. Our method can also be viewed as a form of data mining, in that it can find relationships between attributes that the user might not have noticed beforehand.
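As a rough illustration of the compaction idea described in the abstract, the following Python sketch removes an attribute whenever a rule can recompute it from the element's other attributes, and restores it on expansion. It is not SRML and not the authors' implementation: the `RULES` table, the `item`/`price`/`qty`/`total` names, and the `compact`/`expand` helpers are all hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table (an assumption, not SRML syntax):
# (element tag, attribute name) -> function that computes the attribute
# value from the element's other attributes.
RULES = {
    ("item", "total"): lambda a: str(int(a["price"]) * int(a["qty"])),
}


def compact(root):
    """Delete each attribute whose stored value equals its rule's result.

    Attributes that disagree with the rule are kept as exceptions, so
    expansion can still reproduce the original document.
    """
    removed = 0
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag == tag and attr in elem.attrib:
                if elem.attrib[attr] == rule(elem.attrib):
                    del elem.attrib[attr]
                    removed += 1
    return removed


def expand(root):
    """Recompute every rule-covered attribute that is missing."""
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag == tag and attr not in elem.attrib:
                elem.set(attr, rule(elem.attrib))


SAMPLE = (
    '<order>'
    '<item price="10" qty="3" total="30"/>'
    '<item price="4" qty="2" total="9"/>'
    '</order>'
)

if __name__ == "__main__":
    doc = ET.fromstring(SAMPLE)
    print("attributes removed:", compact(doc))
    print(ET.tostring(doc, encoding="unicode"))
    expand(doc)
    print(ET.tostring(doc, encoding="unicode"))
```

In this sketch the second `item` keeps its `total` because the stored value does not match the rule; keeping such exceptions is what makes the round trip lossless. The compacted tree can then be passed to any general-purpose or XML-specific compressor.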
Similar resources
An approach for compacting XMI documents
One of the most common formats for storing information is XML. It is used in many areas, with its spectrum expanding day by day. A big drawback of the XML format is that the documents can be quite large. This causes problems wherever size is an important issue, for example in embedded systems or whenever the document has to be transferred over a network. Another widely used format is XMI (XML M...
Compacting XML Structures Using a Dynamic Labeling Scheme
Due to the growing popularity of XML as a data exchange and storage format, the need to develop efficient techniques for storing and querying XML documents has emerged. A common approach to achieve this is to use labeling techniques. However, their main problem is that they either do not support updating XML data dynamically or impose huge storage requirements. On the other hand, with the verbo...
Metaheuristic Clustering of Persian XML Documents Based on Structural and Content Similarity
Due to the increasing number of XML documents, organizing these documents effectively in order to retrieve useful information from them is essential. A possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...
Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica
Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...
Compacting XML Data
Doubleday The Da Vinci Code Dan Brown Pocket Star Angels & Demons Dan Brown Dan Brown The Da Vinci Code Doubleday An...